3574 results found.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
111 MByte
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: Large-Scale Multi-Label Text Classification on EU Legislation
- Paper track: Short/Document Analysis
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ilias Chalkidis | EURLEX57K | /N |
Documentation:
The dataset includes documentation in English.
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German Russian
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elena Voita | WMT data | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English Russian
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elena Voita | OpenSubtitles | /N |
Documentation:
None
Written
Treebank,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
- Paper title: Correlating Neural and Symbolic Representations of Language
- Paper track: Long/Machine Learning
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Grzegorz Chrupała | English Web Treebank | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English French Italian
Availability:
Freely Available
License:
For research purposes
Size:
15 thousand pairs Other
Production Status:
Newly created-in progress
Use:
Dialogue
- Paper title: CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yi-Ling Chung | CONAN | /N |
Documentation:
None
Lexicon,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
WordNet License
Size:
60 MByte
Production Status:
Existing-used
Use:
Lexicon Creation/Annotation
- Paper title: Abstractive Text Summarization Based on Deep Learning and Semantic Content Generalization
- Paper track: Long/Summarization
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Panagiotis Kouris | Wordnet | /N |
Documentation:
https://wordnet.princeton.edu/documentation
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
Summarisation
- Paper title: Abstractive Text Summarization Based on Deep Learning and Semantic Content Generalization
- Paper track: Long/Summarization
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Panagiotis Kouris | Annotated English Gigaword | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
NIST
Size:
12 MByte
Production Status:
Existing-used
Use:
Summarisation
- Paper title: Abstractive Text Summarization Based on Deep Learning and Semantic Content Generalization
- Paper track: Long/Summarization
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Panagiotis Kouris | DUC 2004 | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None
Production Status:
Newly created-finished
Use:
Dialogue
- Paper title: The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raquel Fernández | PhotoBook dataset | /N |
Documentation:
None
Multimodal/Multimedia
Video grounding dataset,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None MByte
Production Status:
Newly created-in progress
Use:
Grounding
- Paper title: Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
- Paper track: Long/Vision, Robotics, Multimodal, Grounding and Speech
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Zhenfang Chen | VID-Sentence | /N |
Documentation:
None




